Ion Stoica

Prism: Unleashing GPU Sharing for Cost-Efficient Multi-LLM Serving

May 06, 2025, via arXiv

Sleep-time Compute: Beyond Inference Scaling at Test-time

Apr 17, 2025, via arXiv

R2E-Gym: Procedural Environments and Hybrid Verifiers for Scaling Open-Weights SWE Agents

Apr 09, 2025, via arXiv

HeterMoE: Efficient Training of Mixture-of-Experts Models on Heterogeneous GPUs

Apr 04, 2025, via arXiv

Bandwidth Allocation for Cloud-Augmented Autonomous Driving

Mar 26, 2025, via arXiv

Why Do Multi-Agent LLM Systems Fail?

Mar 17, 2025, via arXiv

WorldModelBench: Judging Video Generation Models As World Models

Feb 28, 2025, via arXiv

Prompt-to-Leaderboard

Feb 20, 2025, via arXiv

S*: Test Time Scaling for Code Generation

Feb 20, 2025, via arXiv

Optimizing Model Selection for Compound AI Systems

Feb 20, 2025, via arXiv